Chinese Named Entity Recognition Method Based on Machine Reading Comprehension
LIU Yiyang1,2, YU Zhengtao1,2, GAO Shengxiang1,2, GUO Junjun1,2, ZHANG Yafei1,2, NIE Bingge1,2
1. Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650504; 2. Yunnan Key Laboratory of Artificial Intelligence, Kunming University of Science and Technology, Kunming 650504
Abstract: Existing named entity recognition methods mainly exploit the context within a single sentence and ignore the influence of document-level context. To address this problem, a Chinese named entity recognition method based on machine reading comprehension is proposed, which applies the idea of reading comprehension to fully mine document-level context features in support of entity recognition. Firstly, for each entity type, the recognition task is transformed into a question answering task, and a triple of question, text and entity answer is constructed. Then, the triple is encoded with a pre-trained BERT model and a convolutional neural network to capture document-level context information. Finally, entity answers are predicted with a binary classifier. Experiments on the MSRA dataset, the People's Daily dataset and a self-built dataset show that the proposed method achieves better performance and that reading comprehension is effective for entity recognition.
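The first step of the abstract, casting recognition as question answering by building a (question, text, entity answer) triple per entity type, can be sketched as follows. This is a minimal illustration under stated assumptions: the question templates, entity-type labels and character-span format below are hypothetical, not the paper's exact templates or data format.

```python
# Illustrative sketch: turning NER examples into MRC-style
# (question, context, answer-span) triples, one question per entity type.
# The question wording and type inventory are assumptions for illustration.

ENTITY_QUESTIONS = {
    "PER": "Find person names in the text.",
    "LOC": "Find location names in the text.",
    "ORG": "Find organization names in the text.",
}

def build_mrc_triples(context, entities):
    """entities: list of (start, end, type) character spans in `context`.
    Returns one {question, context, answers} triple per entity type, so a
    reading-comprehension model can extract entities as answer spans."""
    triples = []
    for etype, question in ENTITY_QUESTIONS.items():
        # Collect the gold answer spans belonging to this entity type;
        # types absent from the sentence yield an empty answer list.
        answers = [(s, e, context[s:e]) for (s, e, t) in entities if t == etype]
        triples.append({"question": question,
                        "context": context,
                        "answers": answers})
    return triples
```

Each triple would then be concatenated as a question-text pair for BERT encoding, with the downstream binary classifier deciding, per token position, whether it starts or ends an answer span.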